Introduction
Fundamental properties of language
language segmentation
- segmented into smaller parts → combined sequentially
skewed frequency distribution
- power law distribution
↳ make language learnable
- arise from cultural transmission → repeatedly learnt by multiple generations
↓
fundamental challenge of language acquisition
- discovery of relevant parts of language
- in spoken language: no clear word boundaries
Segmentation
How segmentation works
statistical regularities
- serve as cues for word boundaries in speech
↓
transitional probabilities
- which syllable is likely to follow another?
- can cue word boundary
Transitional probabilities within a unit (such as a word) will be higher than transitional probabilities across unit boundaries.
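One common way to write this down, stated here as an assumption because the notes give no formula, is as a conditional probability estimated from counts, where x and y are adjacent syllables and count(xy) is the number of times y immediately follows x:

```latex
\[
\mathrm{TP}(y \mid x) = \frac{\mathrm{count}(xy)}{\mathrm{count}(x)}
\]
```

Under this definition, TP is high when x and y belong to the same unit and lower when a unit boundary falls between them.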
Take the sequence pretty baby as an example. Many different words can appear after pretty (e.g., car, boy, hat, cat, and many more), but only a few sounds can follow pre and still yield a possible English sequence (premature, precise, and some more). This makes the transitional probability of syllables within a word (ty given pre in our example) higher than that of syllables across word boundaries (ba given ty). A wealth of experimental evidence shows that infants, children, and adults can track these transitional probabilities and use them to segment a novel continuous speech stream into its constituent parts (see 5 for a review).
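The idea can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used in the studies cited: it estimates each transitional probability as count(xy) / count(x) and places a boundary wherever that value drops below a threshold. The function names, the threshold of 0.9, the three nonsense words, and the fixed word order are all assumptions chosen for the example.

```python
from collections import Counter


def transitional_probabilities(syllables):
    """Return {(x, y): count(xy) / count(x)} for adjacent syllables."""
    pair_counts = Counter(zip(syllables, syllables[1:]))
    # Count each syllable only where it has a successor (all but the last).
    first_counts = Counter(syllables[:-1])
    return {(x, y): n / first_counts[x] for (x, y), n in pair_counts.items()}


def segment(syllables, threshold):
    """Split the stream wherever the TP between neighbours is below threshold."""
    if not syllables:
        return []
    tps = transitional_probabilities(syllables)
    units, current = [], [syllables[0]]
    for x, y in zip(syllables, syllables[1:]):
        if tps[(x, y)] < threshold:  # low TP: assume a unit boundary here
            units.append("".join(current))
            current = []
        current.append(y)
    units.append("".join(current))
    return units


# Hypothetical toy lexicon of nonsense words (illustration only).
lexicon = {
    "A": ["tu", "pi", "ro"],
    "B": ["go", "la", "bu"],
    "C": ["bi", "da", "ku"],
}
# Fixed word order (an assumption) so the result is reproducible.
order = "ABCACBABCBAC"
stream = [syllable for word in order for syllable in lexicon[word]]

tps = transitional_probabilities(stream)
units = segment(stream, threshold=0.9)
print(units)
```

In this toy stream every within-word transition has a probability of 1.0, while transitions across word boundaries are lower because each word is followed by more than one other word. A fixed threshold is the simplest possible boundary rule; placing boundaries at local dips in transitional probability is an alternative that avoids choosing a cutoff.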